Abstract 

In the setting of program algebra (PGA) we consider the repeat instruction. This 
special instruction was designed to represent infinite sequences of primitive instructions 
as finite, linear programs. The resulting mathematical structure is a semigroup. We show 
that a kernel of this syntax can replace PGA as a carrier for program algebra by providing 
axioms for defining single-pass congruence and structural congruence, and equations for 
thread extraction. Finally, we discuss the related program notation PGLA that serves as 
a basis for PGA's tool set. 



1 Introduction 

This paper is a thoroughly revised version of [5]. A "program" is a piece of data for which the 
preferred or natural interpretation (or meaning) is a sequence of primitive instructions (SPI), 
and we say that a program produces such a sequence. The execution of a SPI is single-pass: it 
starts with executing the first primitive instruction, and each primitive instruction is dropped 
after it has been executed or jumped over. 

We work in the setting of Program Algebra (PGA) as laid out in [3], a setting that provides 
an algebraic framework and corresponding semantic foundations for sequential programming. 
A very basic question is how to represent SPIs. PGA was designed to give a very straightforward 
and simple answer to this question, starting with constants for primitive instructions and 
two operations for composing SPIs: concatenation and repetition. Each primitive instruction 
is a finite SPI, and if P and Q are finite SPIs, then so is their concatenation P; Q. An infinite 
SPI is periodic if it can be written as P; Q^ω with P and Q finite SPIs. Here Q^ω is the repetition 
of Q, i.e., the SPI that repeats Q infinitely often. PGA has a complete axiomatization for 
single-pass congruence of finite and periodic SPIs, which is the congruence that characterizes 
extensional equality of SPIs, i.e., the equality defined by having the same primitive instruction 
at each position. For finite and periodic SPIs, single-pass congruence is decidable. 

In this paper (Section 3) we characterize the class of SPIs expressible in PGA without 
making use of the repetition operator: each periodic SPI can be represented as a finite sequence 
of instructions with the help of the repeat instruction. The price to be paid is that such a finite 
sequence contains a non-primitive repeat instruction and that we lose the property that each 
sequence of instructions is a (proper) program. The repeat instructions, or so-called repeaters, 
are defined as follows: for n a natural number larger than 0, 

\\#n 

prescribes to repeat the last preceding n instructions. Repeaters do not comply with the 
notion of single pass execution. For example, for u a primitive instruction, 

u; \\#1 abbreviates the SPI u; u; u; u; ... 

Thus, u; \\#1 is a sequence of two instructions that represents a periodic SPI: the infinite 
sequence of u's, in PGA's notation u^ω. The same SPI is represented by u; u; \\#2 and by 
u; u; u; \\#2, and by many more finite sequences. We write L for the set of all 
finite sequences built from primitive instructions and repeaters with concatenation. 

We also discuss the fact that not each sequence of instructions in L can be called a program: 
for example, which SPI, if any, is meant by 

u; \\#2 

or by a single repeat instruction? We distinguish a subset K of L containing those sequences 
that represent the finite and periodic SPIs. All sequences in K can be called "programs". For 
K, we provide axioms for single-pass congruence and for structural congruence, a congruence 
that admits the chaining of jump counters. The question whether we should deal at all with 
L-sequences with "too large" repeat counters, thus sequences not in K such as u; \\#2, is 
briefly discussed in Section 4. 
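
The reading of repeaters described above can be illustrated with a small sketch. The Python below is not part of PGA or its tool set: it assumes that instructions are written as strings separated by ';' and that a repeater is spelled as in this paper, and the helper names spi and prefix are hypothetical. A repeater whose counter is "too large" is rejected rather than interpreted.

```python
from itertools import islice

REPEATER = r"\\#"  # textual prefix of a repeat instruction, as written in this paper


def spi(program):
    """Yield the primitive instructions denoted by a ';'-separated sequence.

    Instructions to the right of the first repeater are never reached.
    A repeater whose counter exceeds the number of preceding instructions
    is rejected, because its meaning is left open at this point.
    """
    seen = []
    for token in (t.strip() for t in program.split(";")):
        if token.startswith(REPEATER):
            n = int(token[len(REPEATER):])
            if n < 1 or n > len(seen):
                raise ValueError(f"repeat counter {n} is too large in {program!r}")
            loop = seen[-n:]
            while True:  # repeat the last n instructions forever
                yield from loop
        seen.append(token)
        yield token


def prefix(program, length):
    """Return the first `length` primitive instructions of the denoted SPI."""
    return list(islice(spi(program), length))
```

For instance, prefix applied to the sequences u; \\#1 and u; u; \\#2 yields the same list of u's for every length, while u; \\#2 raises an error as soon as the repeater is reached.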

2 PGA basics 

In this section some basic information about PGA (based on [3]) is recalled. Furthermore, 
we briefly discuss Thread Algebra (cf. [7]), earlier described in, e.g., [1, 3] under the name 
Polarized Process Algebra. 

2.1 PGA, primitive instructions and SPIs 

Assume A is a set of constants with typical elements a, b, c, .... PGA-programs are of the 
following form (k ∈ N): 

P ::= a | +a | -a | #k | ! | P; P | P^ω. 

Each of the first five forms above is called a primitive instruction: 

• A basic instruction a is a prescription for a piece of behavior that is considered indivisible 
and executable in finite time. 

• A basic instruction can be turned into a test instruction by prefixing it with either a + 
(positive test instruction) or a - (negative test instruction), thus typically +a, -b etc. 
Test instructions control subsequent execution via the result of their execution (which 
is a Boolean reply). 

• A next kind of primitive instruction is the jump instruction #k: this instruction prescribes 
to jump k instructions ahead (if possible; otherwise deadlock occurs) and generates 
no observable behavior. 

• Finally, the termination instruction ! prescribes successful termination, an event that is 
taken to be observable. 

We write U for the set of primitive instructions and we define each element of U to be a SPI 
(Sequence of Primitive Instructions). 

Finite SPIs are defined using concatenation: if P and Q are SPIs, then so is 

P; Q 

which is the SPI that lists Q's primitive instructions right after those of P, and we take 
concatenation to be an associative operator. 

Periodic SPIs are defined using the repetition operator: if P is a SPI, then 

P^ω 

is the SPI that repeats P forever, thus P; P; P; .... Typical identities that relate repetition 
and concatenation of SPIs are 

(P; P)^ω = P^ω and (P; Q)^ω = P; (Q; P)^ω. 

Another typical identity is P^ω; Q = P^ω, expressing that nothing "can follow" an infinite 
repetition. 
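
The two identities for repetition can be checked on finite prefixes. The following Python sketch is only an illustration: the helper take and the sample lists P and Q are hypothetical, and agreement on a prefix of length N is evidence for the identities, not a proof.

```python
from itertools import chain, cycle, islice


def take(head, loop, n):
    """Return the first n instructions of the SPI head; loop^ω."""
    return list(islice(chain(head, cycle(loop)), n))


P = ["a", "+b"]
Q = ["#2", "!", "c"]
N = 60  # length of the compared prefixes

# (P; P)^ω = P^ω
same_doubling = take([], P + P, N) == take([], P, N)

# (P; Q)^ω = P; (Q; P)^ω
same_rotation = take([], P + Q, N) == take(P, Q + P, N)
```

The identity P^ω; Q = P^ω has no counterpart in this representation: a position after an infinite repetition is never reached, which is exactly what that identity expresses.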

As mentioned before, the execution of a SPI is single-pass: it starts with the first (leftmost) 
instruction, and each instruction is dropped after it has been executed or jumped over. 
In Section 2.3 the precise meaning of primitive instructions in terms of their execution is 
explained. The representation of a finite or periodic SPI in PGA is henceforth called a 
"PGA-program". 

2.2 Canonical forms and two congruences in PGA 

In PGA, different types of equality are discerned, the simplest of which is single-pass 
congruence, identifying PGA-programs that execute identical SPIs (see footnote 1). For 
PGA-programs not containing repetition, single-pass congruence boils down to the associativity 
of concatenation, and is thus axiomatized by 

(X; Y); Z = X; (Y; Z). 

From this point onwards we leave out brackets in repeated concatenations. Define X^1 = X 
and for n > 0, X^(n+1) = X; X^n. According to [3], single-pass congruence for PGA-programs is 
axiomatized by the axioms (schemes) PGA1-PGA4 in Table 1. 

1 Although a bit long, primitive instruction sequence congruence would also be an adequate name. 

(X; Y); Z                                  =  X; (Y; Z)                        (PGA1)
(X^n)^ω                                    =  X^ω                              (PGA2)
X^ω; Y                                     =  X^ω                              (PGA3)
(X; Y)^ω                                   =  X; (Y; X)^ω                      (PGA4)

#n+1; u_1; ...; u_n; #0                    =  #0; u_1; ...; u_n; #0            (PGA5)
#n+1; u_1; ...; u_n; #m                    =  #n+m+1; u_1; ...; u_n; #m        (PGA6)
(#k+n+1; u_1; ...; u_n)^ω                  =  (#k; u_1; ...; u_n)^ω            (PGA7)
X = u_1; ...; u_n; (v_1; ...; v_{m+1})^ω   →  #n+m+k+2; X = #n+k+1; X          (PGA8)

Table 1: PGA-axioms for structural congruence, where k, n, m ∈ N, u_i, v_j range over the 
primitive instructions, and u_1; ...; u_0; represents the empty sequence 

Proposition 1. The unfolding law 

X^ω = X; X^ω 

follows from the axioms PGA2 and PGA4. 

Proof. Straightforward: X^ω = (X; X)^ω = X; (X; X)^ω = X; X^ω, using PGA2, PGA4 and 
PGA2, respectively. □ 

Whenever two PGA-programs X and Y are single-pass congruent, this is written 

X =_spc Y. 

The subscript spc will be dropped if no confusion can arise. Using the axioms PGA1-PGA4 
(thus preserving single-pass congruence), each PGA-program can be rewritten into one of the 
following forms: 

Y not containing repetition, or 

Y; Z^ω with Y and Z not containing repetition. 

Any PGA-program in one of the two above forms is said to be in first canonical form. Moreover, 
in the case of Y; Z^ω, there is a unique first canonical form if the number of instructions in both 
Y and Z is minimized (using PGA1-PGA4). Single-pass congruence is decidable (as recorded 
in [3]). 

PGA-programs in first canonical form can be converted into second canonical form: a 
first canonical form in which no chained jumps occur, i.e., jumps to jump instructions (apart 
from #0), and in which each non-chaining jump into the repeating part is minimized. The 
associated congruence is called structural congruence and is axiomatized in Table 1. Note that 
axiom PGA8 is an equational axiom; the implication is only used to enhance readability. 

2 Conversely, from unfolding and the conditional proof rule X = Y; X ⇒ X = Y^ω, one derives PGA2-4. 

We write X =_sc Y if X and Y are structurally congruent, and drop the subscript if no 
confusion can arise. Two examples, of which the right-hand sides are in second canonical form: 

#2; a; (#5; b; +c)^ω =_sc #4; a; (#2; b; +c)^ω, 
+a; #2; (+b; #2; -c; #2)^ω =_sc +a; #0; (+b; #0; -c; #0)^ω. 

For each PGA-program there exists a structurally congruent second canonical form. Moreover, 
in the case of Y; Z^ω this form is unique if the number of instructions in Y and Z is 
minimized. As a consequence, structural congruence is decidable. In the first example above, 
#4; a; (#2; b; +c)^ω is the unique minimal second canonical form; for the second example it is 

+a; (#0; +b; #0; -c)^ω. 

For more information on PGA we refer to [3, 7]. 

2.3 Thread algebra: behavioral semantics for PGA 

We briefly discuss thread algebra. Threads model the execution of SPIs. Finite threads are 
defined inductively: 

S: stop, the termination thread, 
D: inaction or deadlock, the inactive thread, 
P ⊴ a ⊵ Q: the postconditional composition of P and Q for action a, 
where P and Q are finite threads and a ∈ A. 

The behavior of the thread P ⊴ a ⊵ Q starts with the action a and continues as P upon reply 
true to a, and as Q upon reply false. Note that finite threads always end in S or D. We use 
action prefix a ∘ P as an abbreviation for P ⊴ a ⊵ P and take ∘ to bind strongest. 

Upon its execution, a basic or test instruction yields the equally named action in a 
postconditional composition. Thread extraction on PGA, notation 

|X| 

with X a PGA-program, is defined by the thirteen equations in Table 2. In particular, note 
that upon the execution of a positive test instruction +a, the reply true to a prescribes to 
continue with the next instruction and false to skip the next instruction and to continue with 
the instruction thereafter; if no such instruction is available, deadlock occurs. For the execution 
of a negative test instruction -a, subsequent execution is prescribed by the complementary 
replies. 

(i)     |!|            =  S
(ii)    |!; X|         =  S
(iii)   |a|            =  a ∘ D
(iv)    |a; X|         =  a ∘ |X|
(v)     |+a|           =  a ∘ D
(vi)    |+a; X|        =  |X| ⊴ a ⊵ |#2; X|
(vii)   |-a|           =  a ∘ D
(viii)  |-a; X|        =  |#2; X| ⊴ a ⊵ |X|
(ix)    |#k|           =  D
(x)     |#0; X|        =  D
(xi)    |#1; X|        =  |X|
(xii)   |#k+2; u|      =  D
(xiii)  |#k+2; u; X|   =  |#k+1; X|

Table 2: Equations for thread extraction, where a ranges over the basic instructions, and u 
over the primitive instructions (k ∈ N) 

For a PGA-program in second canonical form, these equations either yield a finite thread, 
or a so-called regular thread, i.e., a finite state thread in which infinite paths can occur. Each 
regular thread can be specified (defined) by a finite number of recursive equations. As a first 
example, the regular thread Q specified by 

Q = a ∘ R 
R = c ∘ R ⊴ b ⊵ (S ⊴ d ⊵ Q) 

can be defined by |a; (+b; #2; #3; c; #4; +d; !; a)^ω|. A picture of this thread: 



[Figure: graph of the thread Q, in which the test node for b is labelled R. Legend: a box [a] 
depicts a ∘ P, and a node ⟨a⟩ depicts P ⊴ a ⊵ P'.] 



Some more examples: 

|a; #3| = a ∘ D, 
|+a; #3; (#0)^ω| = |+a; #0; (#0)^ω| = a ∘ D, 
|#4; a; (#2; b; +c)^ω| = |(+c; #2; b)^ω| = P with P = P ⊴ c ⊵ b ∘ P, 
|+a; #0; (b; #0; -c; #0)^ω| = D ⊴ a ⊵ b ∘ D, 
|+a; #0; (+b; #0; -c; #0)^ω| = D ⊴ a ⊵ P with P = D ⊴ b ⊵ (P ⊴ c ⊵ D). 

It can be inferred that 

|u_1; ...; u_n| = |u_1; ...; u_n; (#0)^ω| 

for u_i ranging over the primitive instructions. We shall use a^∞ as an informal notation for 
the thread defined by |a^ω|. 
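
For PGA-programs without repetition, the equations of Table 2 can be read as a recursive procedure on instruction positions. The Python sketch below is an illustration only: the function name extract and the encoding of threads ("S", "D", or a triple (P, a, Q) for P ⊴ a ⊵ Q) are assumptions made here. Running past the last instruction yields D, in line with the identity |u_1; ...; u_n| = |u_1; ...; u_n; (#0)^ω|.

```python
def extract(prog, i=0):
    """Thread of the repetition-free program prog, started at index i.

    prog is a list of strings such as ["+a", "#2", "b", "!"].
    The result is "S", "D", or a triple (left, action, right) that stands
    for the postconditional composition left ⊴ action ⊵ right.
    """
    if i >= len(prog):
        return "D"  # no instruction left: deadlock
    u = prog[i]
    if u == "!":
        return "S"  # equations (i) and (ii)
    if u.startswith("#"):
        k = int(u[1:])
        # equations (ix)-(xiii): #0 deadlocks, #k continues k positions ahead
        return "D" if k == 0 else extract(prog, i + k)
    if u[0] in "+-":
        nxt, skip = extract(prog, i + 1), extract(prog, i + 2)
        # equations (v)-(viii): a negative test swaps the two branches
        return (nxt, u[1:], skip) if u[0] == "+" else (skip, u[1:], nxt)
    nxt = extract(prog, i + 1)
    return (nxt, u, nxt)  # equations (iii) and (iv): a ∘ P is P ⊴ a ⊵ P
```

With this encoding, extract(["a", "#3"]) returns ("D", "a", "D"), which is the triple form of a ∘ D from the first example above.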

For basic information on thread algebra we refer to [2, 7]; more advanced matters, such as 
an operational semantics for thread algebra, are discussed in [4]. We here only mention the fact 
that each regular thread can be specified in PGA, and, conversely, that each PGA-program 
defines a regular thread. 

3 An instruction semigroup with repeaters 

In this section we introduce an instruction sequence semigroup with repeaters. We distinguish 
a kernel K of this semigroup that can serve as a carrier for program algebra: single-pass 
congruence, structural congruence and thread extraction can all be defined in K without 
reference to PGA. 

3.1 A semigroup L with canonical forms 

We introduce a semigroup L with concatenation as its (associative) operation, starting from 
the set U of primitive instructions and so-called repeaters (also called repeat instructions) 

\\#n 

for all n ∈ N^+ = N \ {0}. 

Brackets are not used in L because we are working in a semigroup. A sequence of primitive 
instructions ending with \\#n will repeat its last n instructions, excluding the repeat 
instruction itself. So for n > 0, 

u_1; ...; u_n; \\#n 

with all u_i primitive instructions represents the same SPI as (u_1; ...; u_n)^ω does. Instructions 
to the right of a repeat instruction are irrelevant and can be deleted. 

For terms not containing a repeat instruction, single-pass congruence boils down to the 
associativity of concatenation, and is thus axiomatized by axiom (1) in Table 3. We prove below 
that single-pass congruence for all SPIs that can be expressed with repeaters is axiomatized 
by the axioms (1)-(4) in Table 3 (and those of equational logic), and we write 

L_spc 

for this particular proof system. Note that X, Y, Z in axioms (1) and (3) range over sequences 
that may contain both primitive and repeat instructions. Although all remaining equations 
in Table 3 are axiom schemes, we shall also call these "axioms". 

In L_spc, axiom (3) implies that each closed term in L can be equated to one that contains 
at most one repeat instruction. We define first canonical L-forms, a preferred representation 
for closed terms in L. 

(X; Y); Z                                      =  X; (Y; Z)                                            (1)
(u_1; ...; u_n)^m; \\#m·n                       =  u_1; ...; u_n; \\#n                                   (2)
\\#n; X                                         =  \\#n                                                  (3)
u_1; ...; u_m; v_1; ...; v_n; \\#m+n            =  u_1; ...; u_m; v_1; ...; v_n; u_1; ...; u_m; \\#m+n   (4)

#k+1; u_1; ...; u_k; #0                        =  #0; u_1; ...; u_k; #0                                (5)
#k+1; u_1; ...; u_k; #m                        =  #k+m+1; u_1; ...; u_k; #m                            (6)
#k+1+ℓ; u_1; ...; u_k; \\#k+1                   =  #ℓ; u_1; ...; u_k; \\#k+1                             (7)
#k+1+m+ℓ; u_1; ...; u_k; v_1; ...; v_m; \\#m    =  #k+1+ℓ; u_1; ...; u_k; v_1; ...; v_m; \\#m            (8)

Table 3: Axioms for SPIs, where k, ℓ ∈ N, m, n ∈ N^+ and u_i, v_j ∈ U 

Definition 1. An L-term is a first canonical L-form if it is of the form 

u_1; ...; u_n or u_1; ...; u_k; \\#n 

with u_i ∈ U, k ∈ N and n ∈ N^+. Here u_1; ...; u_0; represents the empty sequence. 

For a closed term in L, we say that its first canonical L-form is obtained by applying 
axiom (3) to the leftmost occurring repeater if present, and otherwise it is that term itself. 

Not all closed terms in L have an intuitive meaning. For example, 

a; \\#2 and #7; +a; \\#5 

illustrate this situation. Note that such first canonical L-forms can not be rewritten using any 
of the axioms in Table 3. 

3.2 A kernel K with canonical forms and two congruences 

Let K stand for the subset of closed terms whose first canonical L-form has the property that 
a repeat instruction \\#n, if present, is preceded by at least n primitive instructions. A first 
canonical L-form in K will henceforth be called a first canonical K-form. 

Definition 2. Let u_1; ...; u_k be a SPI. The first canonical K-form u_1; ...; u_k is minimal by 
definition, and a first canonical K-form 

u_1; ...; u_k; \\#n 

(so 0 < n ≤ k) is minimal if its repeating part (i.e., u_{k-n+1}; ...; u_k) can not be made smaller 
with axiom scheme (2) and its non-repeating part (i.e., u_1; ...; u_{k-n}) can not be made smaller 
with axiom scheme (4). 
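
The passage from a closed L-term to its first canonical L-form, and the test for membership of K, can be sketched as follows. This Python fragment is an illustration under assumptions made here: a term is a non-empty list of strings, a repeater is spelled as in this paper, and the names first_canonical_l_form and in_kernel are hypothetical.

```python
REPEATER = r"\\#"  # textual prefix of a repeat instruction, as written in this paper


def first_canonical_l_form(instrs):
    """Drop everything to the right of the leftmost repeater (axiom (3))."""
    for i, u in enumerate(instrs):
        if u.startswith(REPEATER):
            return list(instrs[: i + 1])
    return list(instrs)


def in_kernel(instrs):
    """True if the repeater of the first canonical L-form, if any, is not too large.

    instrs is assumed to be a non-empty list of instructions.
    """
    form = first_canonical_l_form(instrs)
    if not form[-1].startswith(REPEATER):
        return True  # no repeater at all: a finite SPI
    n = int(form[-1][len(REPEATER):])
    return 1 <= n <= len(form) - 1  # at least n primitive instructions precede it
```

Both a; \\#2 and #7; +a; \\#5 are rejected by in_kernel, since their repeaters have fewer preceding instructions than their counters require.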
Two examples, where the right-hand sides are minimal first canonical K-forms: 

+a; -b; #4; -b; #4; \\#4 =_spc +a; -b; #4; \\#2, 
-a; +c; #4; +c; \\#2 =_spc -a; +c; #4; \\#2. 

From this point onwards, first canonical K-forms are called K-programs. We state without 
proof that in L_spc each K-program can be rewritten into a unique minimal K-program in 
terms of its number of instructions. As a consequence, single-pass congruence is decidable for 
K-programs. Single-pass congruence for K-programs is captured by the next result. 
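
One way to compute a minimal K-program is sketched below. This is an illustration in the spirit of axiom schemes (2) and (4), not a procedure taken from the text: it first replaces the repeating part by its shortest period and then moves instructions from the non-repeating part into the repeating part as long as that is possible. The function name minimize and the list encoding are assumptions made here.

```python
REPEATER = r"\\#"  # textual prefix of a repeat instruction, as written in this paper


def minimize(program):
    """Return a minimal K-program for a first canonical K-form given as a list."""
    *body, last = program
    if not last.startswith(REPEATER):
        return list(program)  # no repeater: minimal by definition
    n = int(last[len(REPEATER):])
    if not 1 <= n <= len(body):
        raise ValueError("not a first canonical K-form")
    head, loop = body[: len(body) - n], body[len(body) - n:]
    # Shrink the repeating part to its shortest period (compare axiom scheme (2)).
    for p in range(1, n + 1):
        if n % p == 0 and loop[:p] * (n // p) == loop:
            loop = loop[:p]
            break
    # Shrink the non-repeating part by rotating the loop (compare axiom scheme (4)).
    while head and head[-1] == loop[-1]:
        loop = [head.pop()] + loop[:-1]
    return head + loop + [f"{REPEATER}{len(loop)}"]
```

Applied to the left-hand sides of the two examples above, minimize returns the corresponding right-hand sides.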

Theorem 1. Two K-programs P and Q are single-pass congruent if, and only if, 
L_spc ⊢ P = Q. 

Proof. Trivial: restricting to K, soundness and completeness of the proof system L_spc follow 
from its direct relation with PGA: the correspondence between u_1; ...; u_n; \\#n and 
(u_1; ...; u_n)^ω and between the first four axioms of L_spc and PGA1-PGA4 (and the 
corresponding fact that minimal first canonical K-forms are unique). □ 

In order to argue that K is a fully fledged alternative for PGA we define second canonical 
K-forms, and we write 

L_sc 

for the extension of L_spc with all axiom schemes in Table 3 (thus axioms (1)-(8)). 

Definition 3. A second canonical K-form is a first canonical K-form in which no chained 
jumps occur and in the case of u_1; ...; u_m; \\#n all jumps to u_{m-n+1}, ..., u_m are minimized 
(cf. Section 2.2). 

Some examples, the first of which is the instance of axiom (7) for k = ℓ = 0: 

#1; \\#1 =_sc #0; \\#1, 
#2; a; #5; b; +c; \\#3 =_sc #4; a; #2; b; +c; \\#3, 
+a; #2; +b; #2; -c; #2; \\#4 =_sc +a; #0; +b; #0; -c; #0; \\#4 
=_sc +a; #0; +b; #0; -c; \\#4. 

Here the right-hand sides are second canonical K-forms. We state without proof that in L_sc, 
second canonical K-forms have a unique minimal representation in terms of their number 
of instructions (cf. the last example above). As a consequence, structural congruence for 
K-programs is decidable. 

Theorem 2. Two K-programs P and Q are structurally congruent if, and only if, 
L_sc ⊢ P = Q. 

Proof. Trivial: restricting to K, soundness and completeness of the proof system L_sc are 
captured by its direct relation with PGA (minimal second canonical K-forms are unique). □ 

Let X = u_1; ...; u_{n+k}; \\#n, then 

|j, X|  =  |j-n, X|                    if j > n+k, 
|j, X|  =  S                           if u_j = !, 
|j, X|  =  a ∘ |j+1, X|                if u_j = a, 
|j, X|  =  |j+1, X| ⊴ a ⊵ |j+2, X|     if u_j = +a, 
|j, X|  =  |j+2, X| ⊴ a ⊵ |j+1, X|     if u_j = -a, 
|j, X|  =  D                           if u_j = #0, 
|j, X|  =  |j+m, X|                    if u_j = #m. 

Table 4: Equations for thread extraction on K, where u_i ∈ U, k ∈ N and j, n, m ∈ N^+ 

3.3 Thread extraction in K 

Thread extraction can be defined in a straightforward way on second canonical K-forms. We 
write 

⌈X⌉_K 

for the thread extraction of K-program X. Of course, structurally congruent K-programs define 
identical threads. In the case that a K-program contains no repeat instruction, we define 

⌈u_1; ...; u_n⌉_K = ⌈u_1; ...; u_n; #0; \\#1⌉_K. 

To define behavior extraction on second canonical K-forms u_1; ...; u_{n+k}; \\#n we use an 
auxiliary function 

|j, u_1; ...; u_{n+k}; \\#n| 

where the number j refers to the position of instructions: 

⌈u_1; ...; u_{n+k}; \\#n⌉_K = |1, u_1; ...; u_{n+k}; \\#n| 

and |j, u_1; ...; u_{n+k}; \\#n| is defined by the case distinctions in Table 4. 

We state without proof the following result, implying that for a K-program X, ⌈X⌉_K agrees 
with thread extraction on PGA-programs (see Section 2.3 and recall that 

|u_1; ...; u_n| = |u_1; ...; u_n; (#0)^ω| 

can be inferred from the equations in Table 2). 

Theorem 3. Let k ≥ 0, n > 0 and let u_i range over the primitive instructions. Then 
⌈u_1; ...; u_k; u_{k+1}; ...; u_{k+n}; \\#n⌉_K = |u_1; ...; u_k; (u_{k+1}; ...; u_{k+n})^ω|. 
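
The case distinctions of Table 4 can be turned into a finite set of recursive equations, one per position that holds a non-jump instruction. The Python sketch below is an illustration only: the name thread_equations and the encoding (positions are 1-based, a triple (P, a, Q) stands for P ⊴ a ⊵ Q, and "S" and "D" denote the constants) are assumptions made here. Jump chains that never leave the jump instructions are mapped to "D"; this is an added safeguard for inputs that are not in second canonical form.

```python
REPEATER = r"\\#"  # textual prefix of a repeat instruction, as written in this paper


def thread_equations(program):
    """Return (start, equations) for a K-program u_1; ...; u_{n+k}; \\#n."""
    *body, last = program
    n = int(last[len(REPEATER):])
    size = len(body)
    if not 1 <= n <= size:
        raise ValueError("not a K-program with a repeater")

    def resolve(j):
        """Follow wrap-around and jumps from position j to a non-jump position."""
        visited = set()
        while True:
            while j > size:
                j -= n  # |j, X| = |j-n, X| if j > n+k
            if j in visited:
                return "D"  # endless jumping (safeguard, see lead-in)
            visited.add(j)
            u = body[j - 1]
            if not u.startswith("#"):
                return j
            m = int(u[1:])
            if m == 0:
                return "D"  # |j, X| = D if u_j = #0
            j += m  # |j, X| = |j+m, X| if u_j = #m

    equations = {}
    for j, u in enumerate(body, start=1):
        if u.startswith("#"):
            continue  # jumps are resolved away, they get no equation
        if u == "!":
            equations[j] = "S"
        elif u[0] == "+":
            equations[j] = (resolve(j + 1), u[1:], resolve(j + 2))
        elif u[0] == "-":
            equations[j] = (resolve(j + 2), u[1:], resolve(j + 1))
        else:
            nxt = resolve(j + 1)
            equations[j] = (nxt, u, nxt)
    return resolve(1), equations
```

For the K-program a; +b; #2; #3; c; #4; +d; !; a; \\#8, which corresponds to the first example of Section 2.3, position 1 plays the role of Q and position 2 the role of R.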
4 Discussion and conclusions 



We provided an algebraic theory of an ASCII representation of program algebra PGA by 
replacing its repetition operator by a family of repeat instructions \\#k where the counter k 
ranges over the natural numbers larger than 0. The resulting semigroup L admits representation 
by first canonical L-forms (axiom (3)). We distinguished a kernel K of L and provided 
axioms for single-pass congruence, structural congruence and thread extraction. As of yet, 
we see no other application for K than that it confirms the point of view that a program is 
"a sequence of instructions", and that it highlights that the converse is not true: not each 
sequence of instructions can be called a "program". 

The contents of this paper adheres to the philosophy of PGA, i.e., 

1. The mathematical object denoted by a program is a SPI, while the program's meaning 
is a thread to be extracted from that SPI, and 

2. A SPI is the sort of object for which single-pass execution is the preferred operational 
semantics. 



Our main motivation to undertake this research is the idea that in the setting of PGA the 
notion of programming languages or program notations as defined in [5] should be reconsidered. 
In particular, it is questionable whether the program notation PGLA, which is in fact L as 
defined in this paper, is an appropriate example of a programming language. The criterion 
formulated in [3] to use this terminology is the existence of a projection function pgla2pga 
(PGLAtoPGA) that maps any PGLA-program (L-sequence) to a PGA-program. In fact, 
PGLA was considered a first and basic candidate for a PGA-based programming language 
and served as the basis for a tool set and programming environment for program algebra [BJ. 

However, a typical property of the projection function pgla2pga is that if the repeater of 
a first canonical form in L is "too large", it adds #0-instructions to obtain a first canonical 
form in K. This solution does not combine in an elegant way with jumps, as witnessed by the 
following examples where we abbreviate |pgla2pga(A)| by |A|_pgla (as is done in [5]): 

|a; #1; \\#3|_pgla = |a; #1; #0; \\#3|_pgla = |(a; #1; #0)^ω| = a ∘ D, 
|a; #2; \\#3|_pgla = |a; #2; #0; \\#3|_pgla = |(a; #2; #0)^ω| = a^∞, 

and, more generally, 

|a; #k; \\#3|_pgla = |(a; #k; #0)^ω| = a^∞ if k mod 3 = 2, and a ∘ D otherwise. 

So in PGLA's projection of |a; #k; \\#3|_pgla, deadlock D either arises from the added #0 
instruction or from the interplay with \\#3 and the original jump instruction #k. This we 
now consider rather arbitrary, and we prefer to view K, a proper subset of PGLA, as the 
programming language that is closest to PGA. 
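
The case distinction on k mod 3 can be checked by following the jump in (a; #k; #0)^ω. The Python sketch below is an illustration: the function name repeats_a_forever is hypothetical, and a jump that only ever reaches itself again is read as deadlock, which matches the outcome a ∘ D stated above for k mod 3 = 0.

```python
def repeats_a_forever(k):
    """True if |(a; #k; #0)^ω| is a^∞, False if it is a ∘ D."""
    loop = ["a", f"#{k}", "#0"]
    pos, visited = 1, set()  # index 1 holds #k, which is reached right after a
    while pos not in visited:
        visited.add(pos)
        u = loop[pos]
        if u == "a":
            return True  # the jump lands on a, so a is performed again and again
        m = int(u[1:])
        if m == 0:
            return False  # #0 is reached: deadlock
        pos = (pos + m) % len(loop)  # jump m positions ahead in the repetition
    return False  # the jump keeps reaching itself: read as deadlock
```

For every k that was tried, the result agrees with the test k mod 3 = 2.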

Our conclusion is that we consider PGA the more basic theory for providing semantics for 
sequential programming (instead of L or K), if only because there is no canonical interpretation 
of L outside K. This agrees with the point of departure adopted in [3]: a programming 
language is a pair (E, φ) with E a set of expressions (the programs) and φ a projection 
function to PGA. In particular, the projection function pgla2pga maintains its definitional 
status: it agrees with the original definition of PGLA and at the same time with K as the 
(proper) subset of its programs. We note that K satisfies a property that is often seen in 
imperative programming: if 

P; Q 

is a program, then P and Q need not be (well-formed) programs. This is not the case in PGA, 
where decomposition of concatenated programs is valid, or in the setting of SPIs. 

Finally, a word on related work: PGA can be viewed as a theory of instruction sequences 
with our kernel K or PGLA as one of its many representations. Unfortunately, we have not 
been able to identify any pre-existing theory by other authors to which this work can be 
related in a convincing manner. The phrase instruction sequence seems not to play a clear 
role in the theory of programming. The software engineering literature at large features many 
uses of this phrase, but only in a casual setting. 